Maximum compute and memory efficiency
A new level of intelligence for mobile and IoT devices
Frontier intelligence on personal computers
Build autonomous agents that plan, navigate apps, and complete tasks on your behalf, with native support for function calling.
Develop applications with strong audio and visual understanding, for rich multimodal support.
Create multilingual experiences that go beyond translation and understand cultural context.
Improve performance for specific tasks by training Gemma using your preferred frameworks and techniques.
Run models on your own hardware for efficient development and deployment.
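As a rough illustration of the function-calling use case mentioned above, the sketch below shows how an application might route a model-emitted call to a local Python function. The JSON shape (`name`, `arguments`), the `get_weather` tool, and the `dispatch` helper are all assumptions made for this example; they are not taken from Gemma's documentation, and the model output is a hard-coded string rather than a real response.

```python
import json

# Hypothetical tool: the name and signature are illustrative only.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

# Registry mapping tool names the model may emit to local callables.
TOOLS = {"get_weather": get_weather}

def dispatch(model_output: str) -> str:
    """Parse an assumed JSON function call and run the matching tool."""
    call = json.loads(model_output)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    # Pass the model-supplied arguments as keyword arguments.
    return TOOLS[name](**call.get("arguments", {}))

# Stand-in for text a model might produce; not real model output.
result = dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}')
```

Keeping an explicit registry means the application only ever runs functions it has deliberately exposed, whatever name the model produces.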
| Benchmark | Gemma 4 31B IT Thinking | Gemma 4 26B A4B IT Thinking | Gemma 4 E4B IT Thinking | Gemma 4 E2B IT Thinking | Gemma 3 27B IT |
|---|---|---|---|---|---|
| Arena AI (text), as of 4/2/26 | 1452 | 1441 | n/a | n/a | 1365 |
| MMMLU: multilingual Q&A (no tools) | 85.2% | 82.6% | 69.4% | 60.0% | 67.6% |
| MMMU Pro: multimodal reasoning | 76.9% | 73.8% | 52.6% | 44.2% | 49.7% |
| AIME 2026: mathematics (no tools) | 89.2% | 88.3% | 42.5% | 37.5% | 20.8% |
| LiveCodeBench v6: competitive coding problems | 80.0% | 77.1% | 52.0% | 44.0% | 29.1% |
| GPQA Diamond: scientific knowledge (no tools) | 84.3% | 82.3% | 58.6% | 43.4% | 42.4% |
| τ2-bench: agentic tool use (Retail) | 86.4% | 85.5% | 57.5% | 29.4% | 6.6% |
These models were evaluated against a large collection of datasets and metrics covering different aspects of text generation. See the model card for additional benchmarks.
Gemma 4 models undergo the same rigorous infrastructure security protocols as our proprietary models. By choosing Gemma 4, enterprises and sovereign organizations gain a trusted, transparent foundation that delivers state-of-the-art capabilities while meeting the highest standards for security and reliability.
Explore how others build with Gemma